Recently, there has been interest in automatically generated word classes for improving statistical machine translation (SMT) quality, e.g., (Wuebker et al., 2013). We create new models by replacing words with word classes in features applied during decoding; we call these “coarse models”. We find that coarse versions of the bilingual language models (biLMs) of Niehues et al. (2011) yield larger BLEU gains than the original biLMs. BiLMs provide phrase-based systems with rich contextual information from the source sentence; because they have a large number of types, they suffer from data sparsity. Niehues et al. (2011) mitigated this problem by replacing source or target words with parts of speech (POSs). We vary their approach in two ways: ...
Abstract. In this paper, we present a language model based on clusters obtained by applying regular ...
We introduce a bilingually motivated word segmentation approach to languages where word boundaries a...
Bilingual language models (Bi-LMs) refer to language models that are modeled using both source and t...
Automatically clustering words from a monolingual or bilingual training corpus into classes is a wi...
This paper presents a novel approach to improve reordering in phrase-based machine translation by u...
We propose a method to improve the accuracy of parsing bilingual texts (bitexts) with the help of st...
2014-07-28. The goal of machine translation is to translate from one natural language into another usi...
We investigate how to improve bilingual embedding which has been successfully used as a feature in p...
In this paper, we propose a method for learning reordering model for BTG-based statistic...
In current phrase-based SMT systems, more training data is generally better than less. However, a la...
There have been many recent investigations into methods to tune SMT systems using large numbers of s...
We use target-side monolingual data to extend the vocabulary of the translation model in statistical...